51 research outputs found

    Phylogenetic and structural analysis of centromeric DNA and kinetochore proteins

    BACKGROUND: Kinetochores are large multi-protein structures that assemble on centromeric DNA (CEN DNA) and mediate the binding of chromosomes to microtubules. Comprising 125 base-pairs of CEN DNA and 70 or more protein components, Saccharomyces cerevisiae kinetochores are among the best understood. In contrast, most fungal, plant and animal cells assemble kinetochores on CENs that are longer and more complex, raising the question of whether kinetochore architecture has been conserved through evolution, despite considerable divergence in CEN sequence. RESULTS: Using computational approaches, ranging from sequence similarity searches to hidden Markov model-based modeling, we show that organisms with CENs resembling those in S. cerevisiae (point CENs) are very closely related and that all contain a set of 11 kinetochore proteins not found in organisms with complex CENs. Conversely, organisms with complex CENs (regional CENs) contain proteins seemingly absent from point-CEN organisms. However, at least three quarters of known kinetochore proteins are present in all fungi regardless of CEN organization. At least six of these proteins have previously unidentified human orthologs. When fungi and metazoa are compared, almost all have kinetochores constructed around Spc105 and three conserved multi-protein linker complexes (MIND, COMA, and the NDC80 complex). CONCLUSION: Our data suggest that critical structural features of kinetochores have been well conserved from yeast to man. Surprisingly, phylogenetic analysis reveals that human kinetochore proteins are as similar in sequence to their yeast counterparts as to presumptive Drosophila melanogaster or Caenorhabditis elegans orthologs. This finding is consistent with evidence that kinetochore proteins have evolved very rapidly relative to components of other complex cellular structures

    Wilms Tumor Chromatin Profiles Highlight Stem Cell Properties and a Renal Developmental Network

    Wilms tumor is the most common pediatric kidney cancer. To identify transcriptional and epigenetic mechanisms that drive this disease, we compared genome-wide chromatin profiles of Wilms tumors, embryonic stem cells (ESCs), and normal kidney. Wilms tumors prominently exhibit large active chromatin domains previously observed in ESCs. In the cancer, these domains frequently correspond to genes that are critical for kidney development and expressed in the renal stem cell compartment. Wilms cells also express “embryonic” chromatin regulators and maintain stem cell-like p16 silencing. Finally, Wilms and ESCs both exhibit “bivalent” chromatin modifications at silent promoters that may be poised for activation. In Wilms tumor, bivalent promoters correlate to genes expressed in specific kidney compartments and point to a kidney-specific differentiation program arrested at an early-progenitor stage. We suggest that Wilms cells share a transcriptional and epigenetic landscape with a normal renal stem cell, which is inherently susceptible to transformation and may represent a cell of origin for this disease

    EWS-FLI1 Utilizes Divergent Chromatin Remodeling Mechanisms to Directly Activate or Repress Enhancer Elements in Ewing Sarcoma

    Summary: The aberrant transcription factor EWS-FLI1 drives Ewing sarcoma, but its molecular function is not completely understood. We find that EWS-FLI1 reprograms gene regulatory circuits in Ewing sarcoma by directly inducing or repressing enhancers. At GGAA repeat elements, which lack evolutionary conservation and regulatory potential in other cell types, EWS-FLI1 multimers induce chromatin opening and create de novo enhancers that physically interact with target promoters. Conversely, EWS-FLI1 inactivates conserved enhancers containing canonical ETS motifs by displacing wild-type ETS transcription factors. These divergent chromatin-remodeling patterns repress tumor suppressors and mesenchymal lineage regulators while activating oncogenes and potential therapeutic targets, such as the kinase VRK1. Our findings demonstrate how EWS-FLI1 establishes an oncogenic regulatory program governing both tumor survival and differentiation.

    A deep learning system accurately classifies primary and metastatic cancers using passenger mutation patterns.

    In cancer, the primary tumour's organ of origin and histopathology are the strongest determinants of its clinical behaviour, but in 3% of cases a patient presents with a metastatic tumour and no obvious primary. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we train a deep learning classifier to predict cancer type based on patterns of somatic passenger mutations detected in whole genome sequencing (WGS) of 2606 tumours representing 24 common cancer types produced by the PCAWG Consortium. Our classifier achieves an accuracy of 91% on held-out tumor samples and 88% and 83% respectively on independent primary and metastatic samples, roughly double the accuracy of trained pathologists when presented with a metastatic tumour without knowledge of the primary. Surprisingly, adding information on driver mutations reduced accuracy. Our results have clinical applicability, underscore how patterns of somatic passenger mutations encode the state of the cell of origin, and can inform future strategies to detect the source of circulating tumour DNA

    Evolution of pathogenicity and sexual reproduction in eight Candida genomes

    Candida species are the most common cause of opportunistic fungal infection worldwide. Here we report the genome sequences of six Candida species and compare these and related pathogens and non-pathogens. There are significant expansions of cell wall, secreted and transporter gene families in pathogenic species, suggesting adaptations associated with virulence. Large genomic tracts are homozygous in three diploid species, possibly resulting from recent recombination events. Surprisingly, key components of the mating and meiosis pathways are missing from several species. These include major differences at the mating-type loci (MTL); Lodderomyces elongisporus lacks MTL, and components of the a1/α2 cell identity determinant were lost in other species, raising questions about how mating and cell types are controlled. Analysis of the CUG leucine-to-serine genetic-code change reveals that 99% of ancestral CUG codons were erased and new ones arose elsewhere. Lastly, we revise the Candida albicans gene catalogue, identifying many new genes.

    Cancer LncRNA Census reveals evidence for deep functional conservation of long noncoding RNAs in tumorigenesis.

    Long non-coding RNAs (lncRNAs) are a growing focus of cancer genomics studies, creating the need for a resource of lncRNAs with validated cancer roles. Furthermore, it remains debated whether mutated lncRNAs can drive tumorigenesis, and whether such functions could be conserved during evolution. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we introduce the Cancer LncRNA Census (CLC), a compilation of 122 GENCODE lncRNAs with causal roles in cancer phenotypes. In contrast to existing databases, CLC requires strong functional or genetic evidence. CLC genes are enriched amongst driver genes predicted from somatic mutations, and display characteristic genomic features. Strikingly, CLC genes are enriched for driver mutations from unbiased, genome-wide transposon-mutagenesis screens in mice. We identified 10 tumour-causing mutations in orthologues of 8 lncRNAs, including LINC-PINT and NEAT1, but not MALAT1. Thus CLC represents a dataset of high-confidence cancer lncRNAs. Mutagenesis maps are a novel means for identifying deeply-conserved roles of lncRNAs in tumorigenesis

    Analyses of non-coding somatic drivers in 2,658 cancer whole genomes.

    The discovery of drivers of cancer has traditionally focused on protein-coding genes [1-4]. Here we present analyses of driver point mutations and structural variants in non-coding regions across 2,658 genomes from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium [5] of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). For point mutations, we developed a statistically rigorous strategy for combining significance levels from multiple methods of driver discovery that overcomes the limitations of individual methods. For structural variants, we present two methods of driver discovery, and identify regions that are significantly affected by recurrent breakpoints and recurrent somatic juxtapositions. Our analyses confirm previously reported drivers [6,7], raise doubts about others and identify novel candidates, including point mutations in the 5' region of TP53, in the 3' untranslated regions of NFKBIZ and TOB1, focal deletions in BRD4 and rearrangements in the loci of AKR1C genes. We show that although point mutations and structural variants that drive cancer are less frequent in non-coding genes and regulatory sequences than in protein-coding genes, additional examples of these drivers will be found as more cancer genomes become available.